Abstract: Finger millet (E. coracana (L.) Gaertn.) provides food for millions of people in Africa and Asia. In this study,
sequence data were mined from the databases of the National Center for Biotechnology Information (NCBI) with the aim of
developing polymorphic expressed sequence tag-derived simple sequence repeat (EST-SSR) markers. Three selected markers
which showed clear polymorphism in pre-testing with 5 accessions were used to characterize 48 accessions randomly
selected from the finger millet core set. The polymorphic information content (PIC) of the developed markers
ranged from 0.6741 for marker UH-Ec-931 to 0.7658 for marker UH-Ec-958, with a mean PIC value of 0.7171.
Marker UH-Ec-958 showed 13 alleles per locus while marker UH-Ec-956 showed 20 alleles per locus. The mean
number of alleles per locus was 17. Following Nei's approach, a mean gene diversity value of 0.7638 was captured by the
three markers. Cluster analysis of the 48 selected accessions of finger millet showed four major clusters. Accessions from
Zimbabwe and Zambia are distributed in cluster I. Accessions from India are mostly found in cluster IV. Accessions
from Nepal were found mostly in cluster III, while Ugandan accessions are found in clusters II and III. Our
investigation showed that the developed EST-SSRs are quite effective in unraveling the nature of diversity in our studied
population.
1. Introduction

Finger millet (E. coracana (L.) Gaertn.) is a tetraploid
(2n = 4x = 36; genome constitution AABB) belonging to the
grass family Poaceae [1]. The crop is grown as a robust
annual grass on marginal lands of Africa and Asia,
providing food for millions of households. The grains are
a rich source of protein, fiber, minerals, and amino acids [2].
The grains are also used in the therapeutic management of
patients with immune-related problems in East Africa [2].
The nutritional quality inherent in finger millet makes it an
ideal supplement for expectant mothers, breastfeeding
mothers, children, the sick and diabetics [3]. The grains are
used in beer and liquor production [4]. Developing superior,
high-yielding and disease-tolerant cultivars that are highly
adaptable to varying environmental conditions can be
achieved through understanding genetic variation in
available germplasm. Several studies have assessed the
genetic diversity in finger millet using DNA-based
molecular markers like RAPD, RFLP and SSR [5-8].
Despite these achievements, there is a need to intensify
research efforts towards developing markers for crop
improvement. SSR-based EST libraries are powerful tools
for genetic research in genetic variation, gene tagging and
evolution, mapping and analysis of quantitative traits [9].
EST-derived SSRs are quickly obtained, unbiased in their
repeat type, and have a higher probability of being
functionally associated with differences in gene expression
than genomic DNA or cDNA derived SSRs [10].
Furthermore, SSRs acquired through genomic DNA or
cDNA sequences are limited to those probed SSR motifs,
thus restricting the marker utility [10]. EST-SSRs are
obtained from transcribed regions of DNA, are more
conserved, and have a higher rate of transferability to closely
related species and genera [9]. The mining of these abundant
resources provides an opportunity to rapidly expand the
database of molecular markers in finger millet at minimal
cost. The objectives of this study were to (i) develop EST-SSR
markers for finger millet and (ii) validate their polymorphic
potential by using them to describe the diversity of 48
randomly selected representatives from a core set.

2. Material and Methods 

2.1. Plant Material and DNA Extraction 

The plant material consisted of 48 accessions randomly
selected from the pool of 622 core accessions of finger
millet [11]. These comprised entries from 7 countries, namely
Kenya, Malawi, Zambia, Uganda, India, Zimbabwe and
Nepal (Table 1). The seeds were planted in a greenhouse at
the University of Hohenheim, Germany, in spring 2009. The
leaves were harvested at the two-leaf stage, 15 days after
planting (DAP), and a modified CTAB protocol was used
for genomic DNA extraction from the tissues [12].
Standard lambda DNA was used to determine the DNA
concentration on 3% agarose gels. The relative purity of the
extraction was validated using a NanoDrop ND-1000
(NanoDrop Technologies Inc., USA). A final working
concentration of 20 ng/µl was adjusted from the stock for
each sample. DNA samples were stored at -20°C.



Table 1. List of core collections of E. coracana used in the study

Serial number   Genotype identification number   Country of origin
1               2384                             Kenya
2               2399                             Kenya
3               2416                             Kenya
4               2425                             Kenya
5               2437                             Kenya
6               2440                             Kenya
7               2476                             Kenya
8               2487                             Kenya
9               2503                             Kenya
10              2551                             Kenya
11              2606                             Malawi
12              2608                             Malawi
13              2622                             Malawi
14              2633                             Malawi
15              2652                             Malawi
16              2732                             Malawi
17              2857                             Zambia
18              2861                             Zambia
19              2869                             Zambia
20              2871                             Zambia
21              2896                             Zambia
22              3779                             Uganda
23              3780                             Uganda
24              3808                             Uganda
25              3817                             Uganda
26              3826                             Uganda
27              3947                             Uganda
28              3973                             Uganda
29              4057                             Uganda
30              2299                             India
31              2322                             India
32              2264                             India
33              2212                             India
34              2223                             India
35              3062                             India
36              3127                             India
37              3135                             India
38              4245                             Zimbabwe
39              4274                             Zimbabwe
40              4296                             Zimbabwe
41              4312                             Zimbabwe
42              4339                             Zimbabwe
43              4383                             Zimbabwe
44              4403                             Zimbabwe
45              5542                             Nepal
46              5635                             Nepal
47              5537                             Nepal
48              5896                             Nepal


2.2. Data Mining for Microsatellites 

The databases of the National Center for Biotechnology
Information (NCBI) were screened for SSRs. In total,
194 nucleotide and 1,927 EST sequences were
registered for E. coracana in the NCBI databases. Using the
RepeatMasker Open-3.0 program of the Institute for
Systems Biology [13], the sequences were screened for
interspersed repeats. The parameters were set for the
detection of perfect di-, tri-, tetra-, penta- and hexa-nucleotide
motifs with a minimum of 10, 7, 6, 5 and 4 repeat units,
respectively. The output of the program is a detailed
annotation of the repeats that are present in the query
sequence as well as a modified version of the query
sequence in which all the annotated repeats have been
masked [13]. The output was further checked visually to
ensure its conformity with the set parameters.
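
The screening thresholds above can be illustrated with a minimal sketch. It is not RepeatMasker: it swaps in a plain regular-expression scan, and it assumes that the thresholds 10, 7, 6, 5 and 4 are repeat counts for di- to hexa-nucleotide motifs. The function name `find_perfect_ssrs` and the test sequences are hypothetical.

```python
import re

# Minimum number of repeat units per motif length (di=10, tri=7, tetra=6,
# penta=5, hexa=4). Reading the thresholds as repeat counts is an assumption.
MIN_REPEATS = {2: 10, 3: 7, 4: 6, 5: 5, 6: 4}


def find_perfect_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{n,} demands n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are a shorter unit repeated (e.g. AGAG = AG x 2),
            # so a dinucleotide run is not reported again as tetra or hexa.
            if any(motif == motif[:d] * (motif_len // d)
                   for d in range(1, motif_len) if motif_len % d == 0):
                continue
            repeat_count = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeat_count))
    return sorted(hits)


# Hypothetical example: 12 AG repeats flanked by unrelated bases.
example_hits = find_perfect_ssrs("TTGC" + "AG" * 12 + "CCTA")
```

A run of only nine AG repeats falls below the dinucleotide threshold and is not reported, which mirrors the visual check of the output against the set parameters.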

2.3. Oligonucleotide Primer Design and Blasting for 
Specificity 

Primers were designed around the flanking regions of
SSRs of interest using the Primer Premier 5.0 software.
Because of the need for amplification specificity and ease
of primer-template annealing, parameters were set for a
primer length of 17 to 24 bp. The program further estimates
the required annealing temperature based on the melting
temperature of the constituent bases. Primer-template
overlap was avoided during primer design by ensuring that
the region to be amplified lay between the forward and
reverse primers. Efforts were made to reduce the
occurrence of secondary structures such as dimers and
hairpins. The designed primers were subsequently
compared for sequence similarity against published primer









Oscar Nnaemeka Obidiegwu et al.: Development and Genotyping Potentials of EST-SSRs in Finger Millet (E. coracana (L.) Gaertn.)



sequences. This was done using the Basic Local Alignment
Search Tool (BLAST). Default parameters for minimum
homology ratio, length and power threshold were used for
the BLAST search. A sequence alignment was considered
significant if its "power" exceeded 7 standard deviation (SD) units.
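
As a rough illustration of the primer constraints described in this section, the sketch below checks the 17 to 24 bp length window and estimates a melting temperature from base composition. The text does not state which formula Primer Premier 5.0 applies, so the Wallace rule used here is a swapped-in approximation, and the function names are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature in °C by the Wallace rule: 2(A+T) + 4(G+C).

    This is an assumed stand-in, not the formula used by Primer Premier 5.0.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def length_ok(primer, min_len=17, max_len=24):
    """Check the 17 to 24 bp primer length window stated in the text."""
    return min_len <= len(primer) <= max_len


# Forward primer of marker UH-Ec-958 as listed in Table 2.
forward_958 = "AACGATCTGGCCTTCCG"
tm_958 = wallace_tm(forward_958)      # 2 * 7 + 4 * 10 = 54
in_window_958 = length_ok(forward_958)  # 17 bp, inside the window
```

The Wallace rule is only a coarse guide for short oligonucleotides; the annealing temperatures reported in Table 2 were fixed empirically on a gradient thermo-cycler.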

2.4. Optimization of PCR for Designed Primers

The PCR was set up in a 25 µl reaction mixture
consisting of 1× PCR buffer (1.5 mM MgCl2), 20 ng
DNA template, 250 nM each of the forward and reverse
primers, 0.2 mM dNTPs, and 0.5 U of Taq DNA
polymerase (GENAXXON biosciences). The mixture was
assembled on ice. The optimum annealing temperature (T_A)
was determined by first running the PCR on a gradient
thermo-cycler (MJ Research, Inc), which allows a
temperature range of 5°C above and below the estimated
T_A of the primers. The PCR products were evaluated on a gel,
and the temperature giving the clearest band within the
expected fragment range was chosen as the optimum T_A for
the amplification. Optimum conditions for the MgCl2, dNTP,
Taq DNA polymerase and template DNA concentrations were
determined empirically after repeated trials. The primers
whose PCR products were specific in the 5 pre-selected
accessions were used to genotype the 48 accessions of
finger millet.
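
The reaction set-up can be checked with simple arithmetic: the amount of each component is its final concentration multiplied by the 25 µl reaction volume, and the gradient spans 5°C on either side of the estimated annealing temperature. The sketch below uses only the concentrations given above; the helper names are hypothetical, and the 54°C starting value is taken from Table 2 purely as an example.

```python
REACTION_VOLUME_UL = 25.0


def moles(conc_molar, volume_ul):
    """Amount in mol for a concentration in mol/l and a volume in µl."""
    return conc_molar * volume_ul * 1e-6


def gradient_range(estimated_ta_c, span_c=5.0):
    """Lowest and highest block temperature for the gradient run."""
    return (estimated_ta_c - span_c, estimated_ta_c + span_c)


# 250 nM of each primer in 25 µl, expressed in pmol.
primer_pmol = moles(250e-9, REACTION_VOLUME_UL) * 1e12
# 0.2 mM dNTPs in 25 µl, expressed in nmol.
dntp_nmol = moles(0.2e-3, REACTION_VOLUME_UL) * 1e9
# 1.5 mM MgCl2 in 25 µl, expressed in nmol.
mgcl2_nmol = moles(1.5e-3, REACTION_VOLUME_UL) * 1e9

# Example gradient around an estimated annealing temperature of 54°C.
low_c, high_c = gradient_range(54.0)
```

Under these figures each reaction contains about 6.25 pmol of each primer, 5 nmol of dNTPs and 37.5 nmol of MgCl2, and the example gradient runs from 49°C to 59°C.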

2.5. Setting up PCR using Labelled Primers and Capillary 
Electrophoresis 

Following successful optimization of the PCR
conditions, the forward primers were labeled with the
fluorescent dyes tetrachloro-6-carboxyfluorescein
(TET; green), hexachloro-6-carboxyfluorescein (HEX; black)
or 6-carboxyfluorescein (6-FAM; blue). The labeled primers
were used in a new PCR amplification using DNA templates
from the 48 accessions of finger millet. A MegaBACE
sequencer (Amersham Biosciences) was used to separate
the labeled PCR products. Three differently
labeled PCR products (TET, HEX, and FAM) were pooled
per run on a 96-well ABI plate. An energy transfer dye
standard (ET400-R, Amersham Biosciences) was used for
fragment size estimation. The final cocktail for capillary
electrophoresis consisted of 0.6 µl of multiplexed PCR
products and 5 µl of the diluted ET-ROX standard (1:20
dilution with loading solution). The multiplexed product was
denatured at 94°C for 1 minute, cooled, centrifuged for
another 1 minute at 2,500 rpm and loaded in the
MegaBACE. The run time was 75 minutes. At the end
of the capillary electrophoresis, the peak values of the
amplified PCR fragments were assessed using the
MegaBACE Fragment Profiler software (Amersham Biosciences).
Varying allele peak values for a given locus across a
population correspond to varying lengths of the
repetitive base sequence of the SSR. Allele
calling was performed by selecting the allele peaks from the
sized peak list for each trace. Peak selection was based
on the fragment size range of the respective marker and the
expected peak pattern of the SSRs. Peaks that fell
within the same allele bin of the Fragment Profiler were
called as one allele. In the event of multiple stutter peaks,
the peak with the highest fluorescence intensity was called.
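
The allele-calling rules described above can be sketched in a few lines. This is not the Fragment Profiler algorithm: the peak list format, the function name `call_allele`, the 1 bp bin width and the example peaks are all assumptions chosen for illustration.

```python
def call_allele(peaks, size_range, bin_width=1.0):
    """Return the called allele size (bin centre) for one trace, or None.

    peaks: list of (size_bp, intensity) tuples from the sized peak list.
    size_range: (min_bp, max_bp) fragment range expected for the marker.
    bin_width: assumed width of an allele bin in bp.
    """
    low, high = size_range
    # Keep only peaks inside the expected fragment size range of the marker.
    in_range = [(size, height) for size, height in peaks if low <= size <= high]
    if not in_range:
        return None
    # Among stutter peaks, call the one with the highest intensity.
    size, _ = max(in_range, key=lambda peak: peak[1])
    # Snap the size to its bin so nearby sizes count as one allele.
    return round(size / bin_width) * bin_width


# Hypothetical trace: stutter peaks around 190 bp plus an off-range peak.
trace = [(187.8, 300), (189.9, 1200), (191.7, 450), (250.2, 5000)]
called = call_allele(trace, size_range=(170, 210))
```

With this sketch, a peak at 189.9 bp in one trace and a peak at 190.2 bp in another fall in the same bin and are scored as the same allele, while the strong peak at 250.2 bp is ignored because it lies outside the marker's range.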

2.6. Statistical Analysis

The generated data were analyzed using the software Tools
for Population Genetic Analyses (TFPGA) version 1.3 [14]
to calculate the genetic distance between accessions using
Nei's average gene diversity measure [15]. The polymorphic
information content (PIC) of each of the developed markers
was calculated [16]. A dendrogram was constructed using
DARwin 5.0.
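
The text does not spell out the formulas behind [15] and [16]. As a minimal sketch, assuming the commonly used definitions, gene diversity is computed as 1 - Σp_i² and PIC as 1 - Σp_i² - ΣΣ 2p_i²p_j² (i < j), where p_i is the frequency of allele i at a locus. No sample-size correction is applied, and the function names and example allele calls are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Relative frequency of each distinct allele in a list of calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def gene_diversity(freqs):
    """Gene diversity as 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC as 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sum_sq = sum(p * p for p in freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    return 1.0 - sum_sq - cross


# Hypothetical allele calls (fragment sizes in bp) at one locus.
freqs = allele_frequencies([190, 190, 192, 194])
locus_diversity = gene_diversity(freqs)
locus_pic = pic(freqs)
```

Under these definitions PIC is never larger than gene diversity for the same locus, because the cross term is non-negative.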

3. Results and Discussion 

3.1. Informativeness of the Developed EST-SSR Markers 
for Finger Millet 

Out of the 70 SSR-enriched sequences found, primers
were developed for 45. The majority of the
sequences showed many secondary structures and
primer-template overlap. PCR conditions were further optimized for
28 of the primers, but only 3 primers showed clear
polymorphism across the 5 pre-selected accessions and the 48
accessions of finger millet on 1.5% agarose gel. The 3
primers were derived from di-nucleotide sequences (Table 2).
The polymorphic information content (PIC) of the developed
markers ranged from 0.6741 for marker UH-Ec-931
to 0.7658 for marker UH-Ec-958, with a mean PIC value of
0.7171 (Table 3). Marker UH-Ec-958 showed 13 alleles per
locus while marker UH-Ec-956 showed 20 alleles per locus.
The mean number of alleles per locus was 17 (Table 3).
Following Nei's approach, a mean gene diversity value of
0.7638 was captured by the three markers from the 48
accessions of finger millet studied. The highest diversity
value of 0.7970 was captured with marker UH-Ec-958 while
the minimum of 0.7337 was recorded for marker
UH-Ec-931 (Table 3). The three developed EST-SSRs used
for the characterization revealed a total of 51 alleles, an
average of 17 alleles per locus. This average is high
considering the findings of Babu et al. (2007), who reported
an average of 10.58 alleles per locus in their studied
population. Gupta et al. (2010) used 10 RAPD primers to
obtain an average of 8.6 alleles per locus, while with 10
inter simple sequence repeat (ISSR) primers they obtained
5.7 alleles per locus. Compared with these previous results,
our newly developed markers are highly informative in terms
of capturing allelic richness. The mean PIC value of 0.7171
is likewise high when compared with the findings of
[17,8,18], who reported maximum PIC values of 0.50, 0.50
and 0.26, respectively. The high genetic variation within the
48 randomly selected accessions from the core set of finger
millet indicates substantial variance for genetic improvement.
This result further shows that the morphological and
phenotypic descriptors used for the development of the 622
core accessions of finger millet, as reported by [11], were
adequate and captured much of the available diversity.



Table 2. Developed EST-derived SSR markers for finger millet and their characteristics

Marker      Forward primer (5'-3')   Reverse primer (5'-3')   Repeat motif      Annealing temp. [°C]   Expected size (bp)
UH-Ec-956   TCCGTGTTCGGTTGCT         ATGGGTTCACTGACTCTGC      (AG)23 + (AC)11   54                     190
UH-Ec-958   AACGATCTGGCCTTCCG        TGCCGTGCTGCTCCTCT        (GA)21            54                     191
UH-Ec-931   GGAAGTTATCACCAGAA        AGACGGACAAATACACA        (TC)14            54                     236



Table 3. Result of statistical analysis for E. coracana

Marker      N_A   PIC      J
UH-Ec-956   20    0.7115   0.7606
UH-Ec-958   13    0.7658   0.7970
UH-Ec-931   18    0.6741   0.7337
Mean        17    0.7171   0.7638

N_A = Number of alleles
PIC = Polymorphic information content
J = Nei's gene diversity
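
The mean row of Table 3 and the total of 51 alleles can be reproduced from the per-marker values. The short sketch below does only that arithmetic; the dictionary layout and the helper name `column_mean` are illustrative choices.

```python
# Per-marker values copied from Table 3.
table3 = {
    "UH-Ec-956": {"alleles": 20, "pic": 0.7115, "j": 0.7606},
    "UH-Ec-958": {"alleles": 13, "pic": 0.7658, "j": 0.7970},
    "UH-Ec-931": {"alleles": 18, "pic": 0.6741, "j": 0.7337},
}


def column_mean(table, key):
    """Arithmetic mean of one column across all markers."""
    values = [row[key] for row in table.values()]
    return sum(values) / len(values)


total_alleles = sum(row["alleles"] for row in table3.values())  # 51
mean_alleles = column_mean(table3, "alleles")                   # 17.0
mean_pic = round(column_mean(table3, "pic"), 4)                 # 0.7171
mean_j = round(column_mean(table3, "j"), 4)                     # 0.7638
```

The recomputed means agree with the values reported in the table to four decimal places.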



3.2. Cluster Analysis 

Cluster analysis for the 48 selected genotypes of finger
millet is given in Figure 1. The weighted neighbor-joining
dendrogram generated 4 clusters (Figure 1). Clusters I and II
are quite close, suggesting a close ancestral lineage.
Genotypes from Zimbabwe and Zambia are distributed in
cluster I, where they are the major entries. Accessions from
India are mostly found in cluster IV. Accessions from Nepal
were found mostly in cluster III, while Ugandan accessions
are found only in clusters II and III.
The placement of most accessions from Zimbabwe
in cluster I suggests limited introduction and cross-breeding,
and perhaps limited adaptation of foreign accessions in
the region. The restriction of Ugandan accessions to clusters II
and III leads to the hypothesis that accessions in this
region have had limited or minimal success in
hybridizing with accessions from other regions. A second
hypothesis could be that the domestication process has been
restricted to a particular region. East Africa is considered a
primary center of diversity [4], and the spread of accessions
from Zimbabwe, Uganda and Zambia may not have been
favored by varying environmental variables. Accessions
from Nepal are mostly observed in cluster III, suggesting
distinctiveness in this region and minimal dispersion. Indian
accessions are found mainly in cluster IV, suggesting a
closed breeding system. Cultivars introduced from Africa
into India through the sea trade of around 1000 BC must
have been shaped by adaptive forces of evolution. India had
earlier been known as a secondary center of diversity [19].
The accessions from Kenya are fairly represented within all
clusters, suggesting that dispersion from Kenya to other
regions of cultivation has been quite successful. Perhaps
human migration through the developed tourist industry
and sea access in Mombasa, Kenya, facilitated
this. Our developed EST-SSRs are quite effective in the
evaluation of diversity in our studied population. These
SSRs represent a significant contribution to the enrichment
of markers. The exploitation of these markers will
contribute to the genetic advancement of finger millet and
closely related genera.




Figure 1. Weighted neighbour-joining dendrogram illustrating genetic
distance of 48 populations of E. coracana